Effects of High-Order Co-occurrences on Word Semantic Similarities
Abstract
A computational model of the construction of word meaning through exposure to texts is built in order to simulate the effects of co-occurrence values on word semantic similarities, paragraph by paragraph. Semantic similarity is here viewed as association. It turns out that the similarity between two words W1 and W2 strongly increases with a co-occurrence, decreases with the occurrence of W1 without W2 or W2 without W1, and slightly increases with high-order co-occurrences. Therefore, operationalizing similarity as a frequency of co-occurrence probably introduces a bias: first, there are cases in which there is similarity without co-occurrence and, second, the frequency of co-occurrence overestimates similarity.
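To make the three kinds of events in the abstract concrete, here is a minimal sketch that tallies them for one word pair over a toy corpus. It is not the authors' model and it does not compute a similarity value: it only counts paragraphs where both words appear, paragraphs where exactly one appears, and third words that link the pair indirectly. The function name `cooccurrence_events`, the toy paragraphs, and the reading of "high-order" as a second-order link through one shared neighbour are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_events(paragraphs, w1, w2):
    """Tally direct, solo, and second-order events for the pair (w1, w2).

    Hypothetical helper for illustration only; not the model from the paper.
    """
    direct = 0  # paragraphs containing both words
    solo = 0    # paragraphs containing exactly one of the two words
    neighbours = defaultdict(set)  # word -> words seen in the same paragraph

    for paragraph in paragraphs:
        words = set(paragraph)
        if w1 in words and w2 in words:
            direct += 1
        elif w1 in words or w2 in words:
            solo += 1
        # Record every pair of distinct words sharing this paragraph.
        for a, b in combinations(words, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Second-order links: third words that co-occur with w1 in some
    # paragraph and with w2 in some (possibly different) paragraph.
    bridges = (neighbours[w1] & neighbours[w2]) - {w1, w2}
    return {"direct": direct, "solo": solo, "second_order": sorted(bridges)}


# Toy corpus (assumed): each paragraph is a list of words.
paragraphs = [
    ["doctor", "hospital"],
    ["nurse", "hospital"],
    ["doctor", "car"],
]
events = cooccurrence_events(paragraphs, "doctor", "nurse")
print(events)
```

In this toy corpus "doctor" and "nurse" never share a paragraph, yet both co-occur with "hospital". This is the situation the abstract points to when it says there can be similarity without co-occurrence.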
Similar resources
A New Document Embedding Method for News Classification
Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. A large volume of news spreads on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...
Word clustering effect on vocabulary learning of EFL learners: A case of semantic versus phonological clustering
The aim of this study is to determine the effect of word clustering method on vocabulary learning of Iranian EFL learners through a case of semantic versus phonological clustering. To this effect, 80 homogeneous students from four intermediate classes at an English institute in Torbat e Heydariyeh participated in this research. They were assigned to four groups according to semantic versus phon...
Co-word maps and topic modeling: A comparison using small and medium-sized corpora (N < 1,000)
Induced by “big data,” “topic modeling” has become an attractive alternative to mapping co-words in terms of co-occurrences and co-absences using network techniques. Does topic modeling provide an alternative for co-word mapping in research practices using moderately sized document collections? We return to the word/document matrix using first a single text with a strong argument (“The Leiden Ma...
Acquiring Word Similarities with Higher Order Association Mining
We present a novel approach to mine word similarity in Textual Case Based Reasoning. We exploit indirect associations of words, in addition to direct ones for estimating their similarity. If word A co-occurs with word B, we say A and B share a first order association between them. If A co-occurs with B in some documents, and B with C in some others, then A and C are said to share a second order...
Journal title:
- CoRR
Volume abs/0804.0143
Publication date 2006